Sparse Kernel Orthonormalized PLS for feature extraction in large data sets
Abstract
In this paper we present a novel multivariate analysis method for large-scale problems. Our scheme is based on a new kernel orthonormalized partial least squares (PLS) variant for feature extraction, imposing sparsity constraints on the solution to improve scalability. The algorithm is tested on a benchmark of UCI data sets and on the analysis of integrated short-time music features for genre prediction. The upshot is that the method has strong expressive power even with rather few features, clearly outperforms ordinary kernel PLS, and is therefore an appealing method for feature extraction from labelled data.
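The abstract above gives no implementation detail; purely as an illustration, the sketch below shows one way a reduced-complexity kernel OPLS feature extractor of this flavour could look, assuming an RBF kernel, a random subset of training points as the sparse kernel expansion set, and a generalized-eigenvalue formulation of the OPLS objective. All function names and parameter values are illustrative assumptions, not taken from the paper.

```python
import numpy as np
from scipy.linalg import eigh

def rbf_kernel(X, Z, gamma=0.1):
    # Pairwise squared distances, then Gaussian kernel.
    d2 = ((X[:, None, :] - Z[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def sparse_kopls_fit(X, Y, n_basis=100, n_components=5, gamma=0.1, ridge=1e-6, seed=0):
    # Pick a random subset R of training points as the sparse expansion set.
    rng = np.random.default_rng(seed)
    idx = rng.choice(len(X), size=min(n_basis, len(X)), replace=False)
    R = X[idx]
    Kr = rbf_kernel(X, R, gamma)                  # n x r reduced kernel matrix
    # Generalized eigenproblem: maximize a' Kr' Y Y' Kr a  s.t.  a' Kr' Kr a = 1.
    M = Kr.T @ Y @ Y.T @ Kr
    N = Kr.T @ Kr + ridge * np.eye(Kr.shape[1])   # small ridge for numerical stability
    w, V = eigh(M, N)                             # eigenvalues in ascending order
    A = V[:, ::-1][:, :n_components]              # keep the leading projections
    return R, A

def sparse_kopls_transform(Xnew, R, A, gamma=0.1):
    # Project new data onto the extracted features.
    return rbf_kernel(Xnew, R, gamma) @ A

# Toy usage: 5 features from a 3-class problem with a one-hot label matrix Y.
X = np.random.randn(500, 20)
Y = np.eye(3)[np.random.randint(0, 3, 500)]
R, A = sparse_kopls_fit(X, Y, n_basis=50, n_components=5)
Z = sparse_kopls_transform(X, R, A)
print(Z.shape)                                    # (500, 5)
```

Because the kernel expansion is restricted to the subset R, training cost grows with the subset size rather than with the full number of training points, which is the scalability argument made in the abstract.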
Similar resources
Sparse Kernel Orthonormalized PLS for feature extraction in large data sets
We propose a kernel extension of Orthonormalized PLS for feature extraction, within the framework of Kernel Multivariate Analysis (KMVA). KMVA methods have dense solutions and, therefore, scale badly for large datasets. By imposing sparsity, we propose a modified KOPLS algorithm with reduced complexity (rKOPLS). The resulting scheme is a powerful feature extractor for regression and classification...
Kernel PLS-SVC for Linear and Nonlinear Classification
A new method for classification is proposed. This is based on kernel orthonormalized partial least squares (PLS) dimensionality reduction of the original data space followed by a support vector classifier. Unlike principal component analysis (PCA), which has previously served as a dimension reduction step for discrimination problems, orthonormalized PLS is closely related to Fisher’s approach t...
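As a hedged illustration of the two-stage structure described in this abstract (supervised dimensionality reduction followed by a support vector classifier), the sketch below chains scikit-learn's PLSRegression and SVC. Note that PLSRegression is ordinary linear PLS, not the kernel orthonormalized variant used in the cited paper; the example only shows the shape of the pipeline.

```python
import numpy as np
from sklearn.cross_decomposition import PLSRegression
from sklearn.svm import SVC

# Toy two-class problem; Y is the one-hot label matrix used as the PLS target block.
X = np.random.randn(300, 30)
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)
Y = np.eye(2)[y]

pls = PLSRegression(n_components=2).fit(X, Y)   # stage 1: supervised dimensionality reduction
Z = pls.transform(X)                            # low-dimensional scores
clf = SVC(kernel="rbf").fit(Z, y)               # stage 2: support vector classifier
print(clf.score(Z, y))
```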
On the Equivalence between Canonical Correlation Analysis and Orthonormalized Partial Least Squares
Canonical correlation analysis (CCA) and partial least squares (PLS) are well-known techniques for feature extraction from two sets of multidimensional variables. The fundamental difference between CCA and PLS is that CCA maximizes the correlation while PLS maximizes the covariance. Although both CCA and PLS have been applied successfully in various applications, the intrinsic relationship betw...
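For concreteness, the standard objectives can be written in terms of the covariance matrices $C_{xx}$, $C_{yy}$ and $C_{xy}$ (a common textbook formulation, not quoted from the cited abstract):

$$
\text{CCA:}\quad \max_{w_x,\,w_y}\;\frac{w_x^\top C_{xy} w_y}{\sqrt{\left(w_x^\top C_{xx} w_x\right)\left(w_y^\top C_{yy} w_y\right)}}
$$
$$
\text{PLS:}\quad \max_{w_x,\,w_y}\; w_x^\top C_{xy} w_y \quad \text{s.t. } \|w_x\| = \|w_y\| = 1
$$
$$
\text{OPLS:}\quad \max_{W}\; \operatorname{tr}\!\left(W^\top C_{xy} C_{yx} W\right) \quad \text{s.t. } W^\top C_{xx} W = I
$$

CCA normalizes the projections on both blocks, whereas orthonormalized PLS constrains only the input-side covariance.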
Sparse Orthonormalized Partial Least Squares
Orthonormalized partial least squares (OPLS) is often used to find a low-rank mapping between inputs X and outputs Y by estimating loading matrices A and B. In this paper, we introduce sparse orthonormalized PLS as an extension of conventional PLS that finds sparse estimates of A through the use of the elastic net algorithm. We apply sparse OPLS to the reconstruction of presented images from BO...
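As a hedged sketch of the elastic-net step described above, the snippet below re-estimates each column of a loading matrix A with scikit-learn's ElasticNet, so that X @ A approximates a given low-rank score matrix T while many entries of A are driven to zero. The surrounding OPLS decomposition is omitted and all names and parameters are illustrative, not taken from the cited paper.

```python
import numpy as np
from sklearn.linear_model import ElasticNet

def sparse_loadings(X, T, alpha=0.05, l1_ratio=0.7):
    # One elastic net per column of the target score matrix T.
    A = np.zeros((X.shape[1], T.shape[1]))
    for k in range(T.shape[1]):
        enet = ElasticNet(alpha=alpha, l1_ratio=l1_ratio, max_iter=5000)
        A[:, k] = enet.fit(X, T[:, k]).coef_
    return A

# Toy usage: a rank-2 target built from the first three input variables.
X = np.random.randn(200, 50)
T = X[:, :3] @ np.random.randn(3, 2)
A = sparse_loadings(X, T)
print((np.abs(A) > 1e-8).sum(), "non-zero loadings out of", A.size)
```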
Gradient-based kernel method for feature extraction and variable selection
We propose a novel kernel approach to dimension reduction for supervised learning: feature extraction and variable selection; the former constructs a small number of features from predictors, and the latter finds a subset of predictors. First, a method of linear feature extraction is proposed using the gradient of regression function, based on the recent development of the kernel method. In com...
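The abstract gives no formulas; as a rough, hedged sketch of a gradient-based scheme in this spirit, the code below fits a kernel ridge regressor, evaluates the gradient of the fitted function at the training points, and takes the leading eigenvectors of the averaged gradient outer product as linear feature-extraction directions. The RBF kernel and all parameter values are assumptions for illustration, not the method of the cited paper.

```python
import numpy as np

def rbf_kernel(X, Z, gamma):
    d2 = ((X[:, None, :] - Z[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def gradient_feature_directions(X, y, n_dirs=2, gamma=0.1, lam=1e-3):
    n = len(X)
    K = rbf_kernel(X, X, gamma)
    alpha = np.linalg.solve(K + lam * n * np.eye(n), y)   # kernel ridge weights
    # Gradient of f(x) = sum_i alpha_i k(x, x_i) at each training point x_j:
    # grad f(x_j) = -2 * gamma * sum_i alpha_i * K[j, i] * (x_j - x_i)
    diffs = X[:, None, :] - X[None, :, :]                  # n x n x d
    grads = -2 * gamma * np.einsum("i,ji,jid->jd", alpha, K, diffs)
    M = grads.T @ grads / n                                # averaged gradient outer product
    w, V = np.linalg.eigh(M)
    return V[:, ::-1][:, :n_dirs]                          # leading directions

# Toy usage: the response depends only on the first coordinate of X,
# so the leading direction should load mostly on that coordinate.
X = np.random.randn(300, 5)
y = np.sin(X[:, 0])
B = gradient_feature_directions(X, y, n_dirs=1)
print(np.round(B[:, 0], 2))
```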